Dataset statistics
| Number of variables | 23 |
|---|---|
| Number of observations | 41188 |
| Missing cells | 38364 |
| Missing cells (%) | 4.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 7.2 MiB |
| Average record size in memory | 184.0 B |
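The missing-cell percentage above can be recomputed from the counts in this report. The sketch below is only a consistency check: the per-variable missing counts are copied from the variable summaries in this report, and the helper name `missing_cell_share` is illustrative, not part of pandas-profiling.

```python
# Counts copied from this report; the function name is illustrative only.
N_VARIABLES = 23
N_OBSERVATIONS = 41188

# Missing values per variable, as listed in the variable summaries.
MISSING_BY_VARIABLE = {
    "b2": 990,
    "c8": 35563,
    "school": 1731,
    "marriage-status": 80,
}


def missing_cell_share(missing_cells: int, n_rows: int, n_cols: int) -> float:
    """Return the fraction of all cells (rows x columns) that are missing."""
    return missing_cells / (n_rows * n_cols)


total_missing = sum(MISSING_BY_VARIABLE.values())
share = missing_cell_share(total_missing, N_OBSERVATIONS, N_VARIABLES)
print(total_missing, f"{share:.1%}")  # prints: 38364 4.0%
```

The four per-variable counts add up to the reported 38364 missing cells, so no other variable contributes missing values.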
Variable types
| Numeric | 11 |
|---|---|
| Categorical | 8 |
| Boolean | 4 |
| Alert | Type |
|---|---|
| i1 is highly correlated with i2 and 2 other fields | High correlation |
| i2 is highly correlated with i1 | High correlation |
| i4 is highly correlated with i1 and 1 other fields | High correlation |
| i5 is highly correlated with i1 and 1 other fields | High correlation |
| n4 is highly correlated with n6 | High correlation |
| n6 is highly correlated with n4 | High correlation |
| i1 is highly correlated with i2 and 2 other fields | High correlation |
| i2 is highly correlated with i1 and 2 other fields | High correlation |
| i4 is highly correlated with i1 and 2 other fields | High correlation |
| i5 is highly correlated with i1 and 3 other fields | High correlation |
| n4 is highly correlated with n6 | High correlation |
| n6 is highly correlated with i5 and 1 other fields | High correlation |
| i1 is highly correlated with i2 and 2 other fields | High correlation |
| i2 is highly correlated with i1 | High correlation |
| i4 is highly correlated with i1 and 1 other fields | High correlation |
| i5 is highly correlated with i1 and 1 other fields | High correlation |
| n4 is highly correlated with n6 | High correlation |
| n6 is highly correlated with n4 | High correlation |
| successful_sell is highly correlated with c10 | High correlation |
| month is highly correlated with c4 | High correlation |
| c4 is highly correlated with month | High correlation |
| c10 is highly correlated with successful_sell | High correlation |
| age is highly correlated with employment | High correlation |
| c10 is highly correlated with c8 and 4 other fields | High correlation |
| c4 is highly correlated with i1 and 5 other fields | High correlation |
| c8 is highly correlated with c10 and 5 other fields | High correlation |
| employment is highly correlated with age and 1 other fields | High correlation |
| i1 is highly correlated with c4 and 5 other fields | High correlation |
| i2 is highly correlated with c4 and 6 other fields | High correlation |
| i3 is highly correlated with c10 and 9 other fields | High correlation |
| i4 is highly correlated with c10 and 10 other fields | High correlation |
| i5 is highly correlated with c10 and 8 other fields | High correlation |
| month is highly correlated with c4 and 5 other fields | High correlation |
| n4 is highly correlated with c8 and 4 other fields | High correlation |
| n6 is highly correlated with i4 and 1 other fields | High correlation |
| school is highly correlated with employment | High correlation |
| successful_sell is highly correlated with c10 and 4 other fields | High correlation |
| b2 has 990 (2.4%) missing values | Missing |
| c8 has 35563 (86.3%) missing values | Missing |
| school has 1731 (4.2%) missing values | Missing |
| n5 has unique values | Unique |
| n6 has 35563 (86.3%) zeros | Zeros |
Reproduction
| Analysis started | 2021-10-04 02:28:47.693261 |
|---|---|
| Analysis finished | 2021-10-04 02:29:11.159366 |
| Duration | 23.47 seconds |
| Software version | pandas-profiling v3.1.0 |
| Download configuration | config.json |
age
Real number (ℝ≥0)
| Distinct | 78 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 40.02406041 |
| Minimum | 17 |
|---|---|
| Maximum | 98 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 321.9 KiB |
Quantile statistics
| Minimum | 17 |
|---|---|
| 5-th percentile | 26 |
| Q1 | 32 |
| median | 38 |
| Q3 | 47 |
| 95-th percentile | 58 |
| Maximum | 98 |
| Range | 81 |
| Interquartile range (IQR) | 15 |
Descriptive statistics
| Standard deviation | 10.42124998 |
|---|---|
| Coefficient of variation (CV) | 0.2603746315 |
| Kurtosis | 0.7913115312 |
| Mean | 40.02406041 |
| Median Absolute Deviation (MAD) | 7 |
| Skewness | 0.7846968158 |
| Sum | 1648511 |
| Variance | 108.6024512 |
| Monotonicity | Not monotonic |
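Several of the descriptive statistics above are derived from others in the same tables. As a minimal sketch, assuming the usual definitions (coefficient of variation as standard deviation divided by mean, variance as squared standard deviation), the snippet below recomputes them from the values reported for this variable; the variable names are illustrative.

```python
# Summary values copied from the tables above.
mean = 40.02406041
std = 10.42124998
q1, q3 = 32, 47
minimum, maximum = 17, 98

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(round(cv, 6), round(variance, 4), iqr, value_range)
```

The results agree with the reported CV, variance, IQR and range up to rounding of the inputs.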
| Value | Count | Frequency (%) |
| 31 | 1947 | 4.7% |
| 32 | 1846 | 4.5% |
| 33 | 1833 | 4.5% |
| 36 | 1780 | 4.3% |
| 35 | 1759 | 4.3% |
| 34 | 1745 | 4.2% |
| 30 | 1714 | 4.2% |
| 37 | 1475 | 3.6% |
| 29 | 1453 | 3.5% |
| 39 | 1432 | 3.5% |
| Other values (68) | 24204 |
| Value | Count | Frequency (%) |
| 17 | 5 | < 0.1% |
| 18 | 28 | 0.1% |
| 19 | 42 | 0.1% |
| 20 | 65 | 0.2% |
| 21 | 102 | 0.2% |
| 22 | 137 | 0.3% |
| 23 | 226 | 0.5% |
| 24 | 463 | |
| 25 | 598 | |
| 26 | 698 |
| Value | Count | Frequency (%) |
| 98 | 2 | < 0.1% |
| 95 | 1 | < 0.1% |
| 94 | 1 | < 0.1% |
| 92 | 4 | < 0.1% |
| 91 | 2 | < 0.1% |
| 89 | 2 | < 0.1% |
| 88 | 22 | |
| 87 | 1 | < 0.1% |
| 86 | 8 | < 0.1% |
| 85 | 15 |
b1
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 321.9 KiB |
| yes | |
|---|---|
| no | |
| -1 | 990 |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 2.523841896 |
| Min length | 2 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | yes |
|---|---|
| 2nd row | yes |
| 3rd row | no |
| 4th row | yes |
| 5th row | no |
Common Values
| Value | Count | Frequency (%) |
| yes | 21576 | |
| no | 18622 | |
| -1 | 990 | 2.4% |
Length
Pie chart
| Value | Count | Frequency (%) |
| yes | 21576 | |
| no | 18622 | |
| 1 | 990 | 2.4% |
b2
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 990 |
| Missing (%) | 2.4% |
| Memory size | 80.6 KiB |
| False | |
|---|---|
| True | |
| (Missing) | 990 |
| Value | Count | Frequency (%) |
| False | 33950 | |
| True | 6248 | 15.2% |
| (Missing) | 990 | 2.4% |
c10
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 40.3 KiB |
| False | |
|---|---|
| True |
| Value | Count | Frequency (%) |
| False | 36548 | |
| True | 4640 | 11.3% |
c3
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 321.9 KiB |
| False | |
|---|---|
| unknown | |
| True | 3 |
Length
| Max length | 7 |
|---|---|
| Median length | 5 |
| Mean length | 5.417378848 |
| Min length | 4 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | False |
|---|---|
| 2nd row | False |
| 3rd row | unknown |
| 4th row | False |
| 5th row | unknown |
Common Values
| Value | Count | Frequency (%) |
| False | 32588 | |
| unknown | 8597 | 20.9% |
| True | 3 | < 0.1% |
Length
Pie chart
| Value | Count | Frequency (%) |
| false | 32588 | |
| unknown | 8597 | 20.9% |
| true | 3 | < 0.1% |
c4
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 321.9 KiB |
| new | |
|---|---|
| old |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | new |
|---|---|
| 2nd row | new |
| 3rd row | new |
| 4th row | new |
| 5th row | new |
Common Values
| Value | Count | Frequency (%) |
| new | 26144 | |
| old | 15044 |
Length
Pie chart
| Value | Count | Frequency (%) |
| new | 26144 | |
| old | 15044 |
c8
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 35563 |
| Missing (%) | 86.3% |
| Memory size | 80.6 KiB |
| False | |
|---|---|
| True | 1373 |
| (Missing) |
| Value | Count | Frequency (%) |
| False | 4252 | 10.3% |
| True | 1373 | 3.3% |
| (Missing) | 35563 |
dow
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 321.9 KiB |
| thu | |
|---|---|
| mon | |
| wed | |
| tue | |
| fri |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | fri |
|---|---|
| 2nd row | thu |
| 3rd row | tue |
| 4th row | mon |
| 5th row | tue |
Common Values
| Value | Count | Frequency (%) |
| thu | 8623 | |
| mon | 8514 | |
| wed | 8134 | |
| tue | 8090 | |
| fri | 7827 |
Length
Pie chart
| Value | Count | Frequency (%) |
| thu | 8623 | |
| mon | 8514 | |
| wed | 8134 | |
| tue | 8090 | |
| fri | 7827 |
employment
Categorical
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 321.9 KiB |
| assistant | |
|---|---|
| laborer | |
| engineer | |
| customer service | |
| management | |
| Other values (7) |
Length
| Max length | 16 |
|---|---|
| Median length | 8 |
| Mean length | 8.918519957 |
| Min length | 4 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | management |
|---|---|
| 2nd row | assistant |
| 3rd row | leisure |
| 4th row | assistant |
| 5th row | assistant |
Common Values
| Value | Count | Frequency (%) |
| assistant | 10422 | |
| laborer | 9254 | |
| engineer | 6743 | |
| customer service | 3969 | 9.6% |
| management | 2924 | 7.1% |
| leisure | 1720 | 4.2% |
| hobbyist | 1456 | 3.5% |
| self-employed | 1421 | 3.5% |
| cleaner | 1060 | 2.6% |
| none | 1014 | 2.5% |
| Other values (2) | 1205 | 2.9% |
Length
| Value | Count | Frequency (%) |
| assistant | 10422 | |
| laborer | 9254 | |
| engineer | 6743 | |
| customer | 3969 | 8.8% |
| service | 3969 | 8.8% |
| management | 2924 | 6.5% |
| leisure | 1720 | 3.8% |
| hobbyist | 1456 | 3.2% |
| self-employed | 1421 | 3.1% |
| cleaner | 1060 | 2.3% |
| Other values (3) | 2219 | 4.9% |
i1
Real number (ℝ)
| Distinct | 10 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.08188550063 |
| Minimum | -3.4 |
|---|---|
| Maximum | 1.4 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 17191 |
| Negative (%) | 41.7% |
| Memory size | 321.9 KiB |
Quantile statistics
| Minimum | -3.4 |
|---|---|
| 5-th percentile | -2.9 |
| Q1 | -1.8 |
| median | 1.1 |
| Q3 | 1.4 |
| 95-th percentile | 1.4 |
| Maximum | 1.4 |
| Range | 4.8 |
| Interquartile range (IQR) | 3.2 |
Descriptive statistics
| Standard deviation | 1.570959741 |
|---|---|
| Coefficient of variation (CV) | 19.18483405 |
| Kurtosis | -1.062631525 |
| Mean | 0.08188550063 |
| Median Absolute Deviation (MAD) | 0.3 |
| Skewness | -0.7240955492 |
| Sum | 3372.7 |
| Variance | 2.467914506 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1.4 | 16234 | |
| -1.8 | 9184 | |
| 1.1 | 7763 | |
| -0.1 | 3683 | 8.9% |
| -2.9 | 1663 | 4.0% |
| -3.4 | 1071 | 2.6% |
| -1.7 | 773 | 1.9% |
| -1.1 | 635 | 1.5% |
| -3 | 172 | 0.4% |
| -0.2 | 10 | < 0.1% |
| Value | Count | Frequency (%) |
| -3.4 | 1071 | 2.6% |
| -3 | 172 | 0.4% |
| -2.9 | 1663 | 4.0% |
| -1.8 | 9184 | |
| -1.7 | 773 | 1.9% |
| -1.1 | 635 | 1.5% |
| -0.2 | 10 | < 0.1% |
| -0.1 | 3683 | 8.9% |
| 1.1 | 7763 | |
| 1.4 | 16234 |
| Value | Count | Frequency (%) |
| 1.4 | 16234 | |
| 1.1 | 7763 | |
| -0.1 | 3683 | 8.9% |
| -0.2 | 10 | < 0.1% |
| -1.1 | 635 | 1.5% |
| -1.7 | 773 | 1.9% |
| -1.8 | 9184 | |
| -2.9 | 1663 | 4.0% |
| -3 | 172 | 0.4% |
| -3.4 | 1071 | 2.6% |
i2
Real number (ℝ≥0)
| Distinct | 26 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 93.57566437 |
| Minimum | 92.201 |
|---|---|
| Maximum | 94.767 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 321.9 KiB |
Quantile statistics
| Minimum | 92.201 |
|---|---|
| 5-th percentile | 92.713 |
| Q1 | 93.075 |
| median | 93.749 |
| Q3 | 93.994 |
| 95-th percentile | 94.465 |
| Maximum | 94.767 |
| Range | 2.566 |
| Interquartile range (IQR) | 0.919 |
Descriptive statistics
| Standard deviation | 0.578840049 |
|---|---|
| Coefficient of variation (CV) | 0.00618579684 |
| Kurtosis | -0.8298085772 |
| Mean | 93.57566437 |
| Median Absolute Deviation (MAD) | 0.38 |
| Skewness | -0.2308876514 |
| Sum | 3854194.464 |
| Variance | 0.3350558023 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 93.994 | 7763 | |
| 93.918 | 6685 | |
| 92.893 | 5794 | |
| 93.444 | 5175 | |
| 94.465 | 4374 | |
| 93.2 | 3616 | |
| 93.075 | 2458 | 6.0% |
| 92.201 | 770 | 1.9% |
| 92.963 | 715 | 1.7% |
| 92.431 | 447 | 1.1% |
| Other values (16) | 3391 |
| Value | Count | Frequency (%) |
| 92.201 | 770 | 1.9% |
| 92.379 | 267 | 0.6% |
| 92.431 | 447 | 1.1% |
| 92.469 | 178 | 0.4% |
| 92.649 | 357 | 0.9% |
| 92.713 | 172 | 0.4% |
| 92.756 | 10 | < 0.1% |
| 92.843 | 282 | 0.7% |
| 92.893 | 5794 | |
| 92.963 | 715 | 1.7% |
| Value | Count | Frequency (%) |
| 94.767 | 128 | 0.3% |
| 94.601 | 204 | 0.5% |
| 94.465 | 4374 | |
| 94.215 | 311 | 0.8% |
| 94.199 | 303 | 0.7% |
| 94.055 | 229 | 0.6% |
| 94.027 | 233 | 0.6% |
| 93.994 | 7763 | |
| 93.918 | 6685 | |
| 93.876 | 212 | 0.5% |
i3
Real number (ℝ)
| Distinct | 26 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -40.50260027 |
| Minimum | -50.8 |
|---|---|
| Maximum | -26.9 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 41188 |
| Negative (%) | 100.0% |
| Memory size | 321.9 KiB |
Quantile statistics
| Minimum | -50.8 |
|---|---|
| 5-th percentile | -47.1 |
| Q1 | -42.7 |
| median | -41.8 |
| Q3 | -36.4 |
| 95-th percentile | -33.6 |
| Maximum | -26.9 |
| Range | 23.9 |
| Interquartile range (IQR) | 6.3 |
Descriptive statistics
| Standard deviation | 4.628197856 |
|---|---|
| Coefficient of variation (CV) | -0.1142691537 |
| Kurtosis | -0.3585583105 |
| Mean | -40.50260027 |
| Median Absolute Deviation (MAD) | 4.4 |
| Skewness | 0.3031798587 |
| Sum | -1668221.1 |
| Variance | 21.4202154 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| -36.4 | 7763 | |
| -42.7 | 6685 | |
| -46.2 | 5794 | |
| -36.1 | 5175 | |
| -41.8 | 4374 | |
| -42 | 3616 | |
| -47.1 | 2458 | 6.0% |
| -31.4 | 770 | 1.9% |
| -40.8 | 715 | 1.7% |
| -26.9 | 447 | 1.1% |
| Other values (16) | 3391 |
| Value | Count | Frequency (%) |
| -50.8 | 128 | 0.3% |
| -50 | 282 | 0.7% |
| -49.5 | 204 | 0.5% |
| -47.1 | 2458 | 6.0% |
| -46.2 | 5794 | |
| -45.9 | 10 | < 0.1% |
| -42.7 | 6685 | |
| -42 | 3616 | |
| -41.8 | 4374 | |
| -40.8 | 715 | 1.7% |
| Value | Count | Frequency (%) |
| -26.9 | 447 | 1.1% |
| -29.8 | 267 | 0.6% |
| -30.1 | 357 | 0.9% |
| -31.4 | 770 | 1.9% |
| -33 | 172 | 0.4% |
| -33.6 | 178 | 0.4% |
| -34.6 | 174 | 0.4% |
| -34.8 | 264 | 0.6% |
| -36.1 | 5175 | |
| -36.4 | 7763 |
i4
Real number (ℝ≥0)
| Distinct | 316 |
|---|---|
| Distinct (%) | 0.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.621290813 |
| Minimum | 0.634 |
|---|---|
| Maximum | 5.045 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 321.9 KiB |
Quantile statistics
| Minimum | 0.634 |
|---|---|
| 5-th percentile | 0.797 |
| Q1 | 1.344 |
| median | 4.857 |
| Q3 | 4.961 |
| 95-th percentile | 4.966 |
| Maximum | 5.045 |
| Range | 4.411 |
| Interquartile range (IQR) | 3.617 |
Descriptive statistics
| Standard deviation | 1.734447405 |
|---|---|
| Coefficient of variation (CV) | 0.4789583313 |
| Kurtosis | -1.406802622 |
| Mean | 3.621290813 |
| Median Absolute Deviation (MAD) | 0.108 |
| Skewness | -0.7091879564 |
| Sum | 149153.726 |
| Variance | 3.0083078 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 4.857 | 2868 | 7.0% |
| 4.962 | 2613 | 6.3% |
| 4.963 | 2487 | 6.0% |
| 4.961 | 1902 | 4.6% |
| 4.856 | 1210 | 2.9% |
| 4.964 | 1175 | 2.9% |
| 1.405 | 1169 | 2.8% |
| 4.965 | 1071 | 2.6% |
| 4.864 | 1044 | 2.5% |
| 4.96 | 1013 | 2.5% |
| Other values (306) | 24636 |
| Value | Count | Frequency (%) |
| 0.634 | 8 | < 0.1% |
| 0.635 | 43 | |
| 0.636 | 14 | < 0.1% |
| 0.637 | 6 | < 0.1% |
| 0.638 | 7 | < 0.1% |
| 0.639 | 16 | < 0.1% |
| 0.64 | 10 | < 0.1% |
| 0.642 | 35 | |
| 0.643 | 23 | |
| 0.644 | 38 |
| Value | Count | Frequency (%) |
| 5.045 | 9 | < 0.1% |
| 5 | 7 | < 0.1% |
| 4.97 | 172 | 0.4% |
| 4.968 | 992 | 2.4% |
| 4.967 | 643 | 1.6% |
| 4.966 | 622 | 1.5% |
| 4.965 | 1071 | |
| 4.964 | 1175 | |
| 4.963 | 2487 | |
| 4.962 | 2613 |
i5
Real number (ℝ≥0)
| Distinct | 11 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5167.035911 |
| Minimum | 4963.6 |
|---|---|
| Maximum | 5228.1 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 321.9 KiB |
Quantile statistics
| Minimum | 4963.6 |
|---|---|
| 5-th percentile | 5017.5 |
| Q1 | 5099.1 |
| median | 5191 |
| Q3 | 5228.1 |
| 95-th percentile | 5228.1 |
| Maximum | 5228.1 |
| Range | 264.5 |
| Interquartile range (IQR) | 129 |
Descriptive statistics
| Standard deviation | 72.25152767 |
|---|---|
| Coefficient of variation (CV) | 0.01398316732 |
| Kurtosis | -0.003760375696 |
| Mean | 5167.035911 |
| Median Absolute Deviation (MAD) | 37.1 |
| Skewness | -1.044262407 |
| Sum | 212819875.1 |
| Variance | 5220.28325 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 5228.1 | 16234 | |
| 5099.1 | 8534 | |
| 5191 | 7763 | |
| 5195.8 | 3683 | 8.9% |
| 5076.2 | 1663 | 4.0% |
| 5017.5 | 1071 | 2.6% |
| 4991.6 | 773 | 1.9% |
| 5008.7 | 650 | 1.6% |
| 4963.6 | 635 | 1.5% |
| 5023.5 | 172 | 0.4% |
| Value | Count | Frequency (%) |
| 4963.6 | 635 | 1.5% |
| 4991.6 | 773 | 1.9% |
| 5008.7 | 650 | 1.6% |
| 5017.5 | 1071 | 2.6% |
| 5023.5 | 172 | 0.4% |
| 5076.2 | 1663 | 4.0% |
| 5099.1 | 8534 | |
| 5176.3 | 10 | < 0.1% |
| 5191 | 7763 | |
| 5195.8 | 3683 |
| Value | Count | Frequency (%) |
| 5228.1 | 16234 | |
| 5195.8 | 3683 | 8.9% |
| 5191 | 7763 | |
| 5176.3 | 10 | < 0.1% |
| 5099.1 | 8534 | |
| 5076.2 | 1663 | 4.0% |
| 5023.5 | 172 | 0.4% |
| 5017.5 | 1071 | 2.6% |
| 5008.7 | 650 | 1.6% |
| 4991.6 | 773 | 1.9% |
marriage-status
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 80 |
| Missing (%) | 0.2% |
| Memory size | 321.9 KiB |
| married | |
|---|---|
| single | |
| divorced |
Length
| Max length | 8 |
|---|---|
| Median length | 7 |
| Mean length | 6.830787195 |
| Min length | 6 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | divorced |
|---|---|
| 2nd row | divorced |
| 3rd row | married |
| 4th row | married |
| 5th row | married |
Common Values
| Value | Count | Frequency (%) |
| married | 24928 | |
| single | 11568 | |
| divorced | 4612 | 11.2% |
| (Missing) | 80 | 0.2% |
Length
Pie chart
| Value | Count | Frequency (%) |
| married | 24928 | |
| single | 11568 | |
| divorced | 4612 | 11.2% |
month
Categorical
| Distinct | 10 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 321.9 KiB |
| may | |
|---|---|
| jul | |
| aug | |
| jun | |
| nov | |
| Other values (5) |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | apr |
|---|---|
| 2nd row | may |
| 3rd row | jul |
| 4th row | nov |
| 5th row | jul |
Common Values
| Value | Count | Frequency (%) |
| may | 13769 | |
| jul | 7174 | |
| aug | 6178 | |
| jun | 5318 | 12.9% |
| nov | 4101 | 10.0% |
| apr | 2632 | 6.4% |
| oct | 718 | 1.7% |
| sep | 570 | 1.4% |
| mar | 546 | 1.3% |
| dec | 182 | 0.4% |
Length
Pie chart
| Value | Count | Frequency (%) |
| may | 13769 | |
| jul | 7174 | |
| aug | 6178 | |
| jun | 5318 | 12.9% |
| nov | 4101 | 10.0% |
| apr | 2632 | 6.4% |
| oct | 718 | 1.7% |
| sep | 570 | 1.4% |
| mar | 546 | 1.3% |
| dec | 182 | 0.4% |
n2
Real number (ℝ≥0)
| Distinct | 42 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.567592503 |
| Minimum | 1 |
|---|---|
| Maximum | 56 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 321.9 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 2 |
| Q3 | 3 |
| 95-th percentile | 7 |
| Maximum | 56 |
| Range | 55 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 2.770013543 |
|---|---|
| Coefficient of variation (CV) | 1.078836903 |
| Kurtosis | 36.97979514 |
| Mean | 2.567592503 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 4.762506697 |
| Sum | 105754 |
| Variance | 7.672975028 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 17642 | |
| 2 | 10570 | |
| 3 | 5341 | 13.0% |
| 4 | 2651 | 6.4% |
| 5 | 1599 | 3.9% |
| 6 | 979 | 2.4% |
| 7 | 629 | 1.5% |
| 8 | 400 | 1.0% |
| 9 | 283 | 0.7% |
| 10 | 225 | 0.5% |
| Other values (32) | 869 | 2.1% |
| Value | Count | Frequency (%) |
| 1 | 17642 | |
| 2 | 10570 | |
| 3 | 5341 | 13.0% |
| 4 | 2651 | 6.4% |
| 5 | 1599 | 3.9% |
| 6 | 979 | 2.4% |
| 7 | 629 | 1.5% |
| 8 | 400 | 1.0% |
| 9 | 283 | 0.7% |
| 10 | 225 | 0.5% |
| Value | Count | Frequency (%) |
| 56 | 1 | < 0.1% |
| 43 | 2 | < 0.1% |
| 42 | 2 | < 0.1% |
| 41 | 1 | < 0.1% |
| 40 | 2 | < 0.1% |
| 39 | 1 | < 0.1% |
| 37 | 1 | < 0.1% |
| 35 | 5 | |
| 34 | 3 | |
| 33 | 4 |
n3
Real number (ℝ≥0)
| Distinct | 50 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 745.1420317 |
| Minimum | 500 |
|---|---|
| Maximum | 990 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 321.9 KiB |
Quantile statistics
| Minimum | 500 |
|---|---|
| 5-th percentile | 520 |
| Q1 | 620 |
| median | 750 |
| Q3 | 870 |
| 95-th percentile | 970 |
| Maximum | 990 |
| Range | 490 |
| Interquartile range (IQR) | 250 |
Descriptive statistics
| Standard deviation | 144.2461959 |
|---|---|
| Coefficient of variation (CV) | 0.1935821492 |
| Kurtosis | -1.201793581 |
| Mean | 745.1420317 |
| Median Absolute Deviation (MAD) | 120 |
| Skewness | 0.002266545452 |
| Sum | 30690910 |
| Variance | 20806.96504 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 770 | 913 | 2.2% |
| 660 | 902 | 2.2% |
| 630 | 873 | 2.1% |
| 980 | 866 | 2.1% |
| 940 | 862 | 2.1% |
| 680 | 862 | 2.1% |
| 610 | 858 | 2.1% |
| 500 | 855 | 2.1% |
| 700 | 850 | 2.1% |
| 920 | 845 | 2.1% |
| Other values (40) | 32502 |
| Value | Count | Frequency (%) |
| 500 | 855 | |
| 510 | 785 | |
| 520 | 812 | |
| 530 | 831 | |
| 540 | 788 | |
| 550 | 839 | |
| 560 | 791 | |
| 570 | 798 | |
| 580 | 831 | |
| 590 | 829 |
| Value | Count | Frequency (%) |
| 990 | 781 | |
| 980 | 866 | |
| 970 | 809 | |
| 960 | 825 | |
| 950 | 830 | |
| 940 | 862 | |
| 930 | 828 | |
| 920 | 845 | |
| 910 | 842 | |
| 900 | 838 |
n4
Real number (ℝ≥0)
| Distinct | 27 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 962.475454 |
| Minimum | 0 |
|---|---|
| Maximum | 999 |
| Zeros | 15 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 321.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 999 |
| Q1 | 999 |
| median | 999 |
| Q3 | 999 |
| 95-th percentile | 999 |
| Maximum | 999 |
| Range | 999 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 186.9109073 |
|---|---|
| Coefficient of variation (CV) | 0.194198103 |
| Kurtosis | 22.22946263 |
| Mean | 962.475454 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | -4.922189916 |
| Sum | 39642439 |
| Variance | 34935.68728 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 999 | 39673 | |
| 3 | 439 | 1.1% |
| 6 | 412 | 1.0% |
| 4 | 118 | 0.3% |
| 9 | 64 | 0.2% |
| 2 | 61 | 0.1% |
| 7 | 60 | 0.1% |
| 12 | 58 | 0.1% |
| 10 | 52 | 0.1% |
| 5 | 46 | 0.1% |
| Other values (17) | 205 | 0.5% |
| Value | Count | Frequency (%) |
| 0 | 15 | < 0.1% |
| 1 | 26 | 0.1% |
| 2 | 61 | 0.1% |
| 3 | 439 | |
| 4 | 118 | 0.3% |
| 5 | 46 | 0.1% |
| 6 | 412 | |
| 7 | 60 | 0.1% |
| 8 | 18 | < 0.1% |
| 9 | 64 | 0.2% |
| Value | Count | Frequency (%) |
| 999 | 39673 | |
| 27 | 1 | < 0.1% |
| 26 | 1 | < 0.1% |
| 25 | 1 | < 0.1% |
| 22 | 3 | < 0.1% |
| 21 | 2 | < 0.1% |
| 20 | 1 | < 0.1% |
| 19 | 3 | < 0.1% |
| 18 | 7 | < 0.1% |
| 17 | 8 | < 0.1% |
n5
Real number (ℝ)
| Distinct | 41188 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -7.401282414 × 10⁻⁵ |
| Minimum | -4.354231256 |
|---|---|
| Maximum | 4.547728948 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 20560 |
| Negative (%) | 49.9% |
| Memory size | 321.9 KiB |
Quantile statistics
| Minimum | -4.354231256 |
|---|---|
| 5-th percentile | -1.634697615 |
| Q1 | -0.6797248422 |
| median | 0.001356889153 |
| Q3 | 0.6733795808 |
| 95-th percentile | 1.641891473 |
| Maximum | 4.547728948 |
| Range | 8.901960204 |
| Interquartile range (IQR) | 1.353104423 |
Descriptive statistics
| Standard deviation | 0.9970236956 |
|---|---|
| Coefficient of variation (CV) | -13470.95868 |
| Kurtosis | 0.001034753214 |
| Mean | -7.401282414 × 10⁻⁵ |
| Median Absolute Deviation (MAD) | 0.6763812753 |
| Skewness | -0.001647365724 |
| Sum | -3.0484402 |
| Variance | 0.9940562496 |
| Monotonicity | Not monotonic |
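The very large coefficient of variation reported here comes from dividing by a mean that is almost zero, so it says little about the spread itself. The sketch below, assuming the usual definitions (mean as sum divided by count, CV as standard deviation divided by mean), recomputes both from the values in these tables; the variable names are illustrative.

```python
# Values copied from this report.
n_observations = 41188
total = -3.0484402
std = 0.9970236956

mean = total / n_observations  # very close to zero
cv = std / mean                # dividing by a near-zero mean inflates the ratio

print(f"{mean:.6e}", round(cv, 1))
```

For a variable centred near zero like this one, the standard deviation (about 1) is the more informative summary of spread.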
| Value | Count | Frequency (%) |
| 0.00177117304 | 1 | < 0.1% |
| 0.4509224724 | 1 | < 0.1% |
| 0.06879677852 | 1 | < 0.1% |
| -0.3983256089 | 1 | < 0.1% |
| -1.301552237 | 1 | < 0.1% |
| -0.3729084964 | 1 | < 0.1% |
| -1.80999416 | 1 | < 0.1% |
| 2.201383552 | 1 | < 0.1% |
| 0.8080306096 | 1 | < 0.1% |
| -0.314169951 | 1 | < 0.1% |
| Other values (41178) | 41178 |
| Value | Count | Frequency (%) |
| -4.354231256 | 1 | |
| -4.006671358 | 1 | |
| -3.968292903 | 1 | |
| -3.861580784 | 1 | |
| -3.810956666 | 1 | |
| -3.774129379 | 1 | |
| -3.548822293 | 1 | |
| -3.54463679 | 1 | |
| -3.539784975 | 1 | |
| -3.45636241 | 1 |
| Value | Count | Frequency (%) |
| 4.547728948 | 1 | |
| 3.665093123 | 1 | |
| 3.62280555 | 1 | |
| 3.579895169 | 1 | |
| 3.568784989 | 1 | |
| 3.550962934 | 1 | |
| 3.495446235 | 1 | |
| 3.484466972 | 1 | |
| 3.407292524 | 1 | |
| 3.394613166 | 1 |
n6
Real number (ℝ≥0)
| Distinct | 8 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.1729629989 |
| Minimum | 0 |
|---|---|
| Maximum | 7 |
| Zeros | 35563 |
| Zeros (%) | 86.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 321.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 1 |
| Maximum | 7 |
| Range | 7 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.4949010798 |
|---|---|
| Coefficient of variation (CV) | 2.861311858 |
| Kurtosis | 20.10881622 |
| Mean | 0.1729629989 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 3.832042243 |
| Sum | 7124 |
| Variance | 0.2449270788 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 35563 | |
| 1 | 4561 | 11.1% |
| 2 | 754 | 1.8% |
| 3 | 216 | 0.5% |
| 4 | 70 | 0.2% |
| 5 | 18 | < 0.1% |
| 6 | 5 | < 0.1% |
| 7 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 0 | 35563 | |
| 1 | 4561 | 11.1% |
| 2 | 754 | 1.8% |
| 3 | 216 | 0.5% |
| 4 | 70 | 0.2% |
| 5 | 18 | < 0.1% |
| 6 | 5 | < 0.1% |
| 7 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 7 | 1 | < 0.1% |
| 6 | 5 | < 0.1% |
| 5 | 18 | < 0.1% |
| 4 | 70 | 0.2% |
| 3 | 216 | 0.5% |
| 2 | 754 | 1.8% |
| 1 | 4561 | 11.1% |
| 0 | 35563 |
school
Categorical
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 1731 |
| Missing (%) | 4.2% |
| Memory size | 321.9 KiB |
| 5 - a lot | |
|---|---|
| 4 - average amount | |
| 3 - a bit more | |
| 5 - a decent amount | |
| 1 - almost none | |
| Other values (2) |
Length
| Max length | 19 |
|---|---|
| Median length | 15 |
| Mean length | 14.30633348 |
| Min length | 8 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 5 - a decent amount |
|---|---|
| 2nd row | 5 - a lot |
| 3rd row | 2 - a little bit |
| 4th row | 5 - a lot |
| 5th row | 5 - a lot |
Common Values
| Value | Count | Frequency (%) |
| 5 - a lot | 12168 | |
| 4 - average amount | 9515 | |
| 3 - a bit more | 6045 | |
| 5 - a decent amount | 5243 | |
| 1 - almost none | 4176 | 10.1% |
| 2 - a little bit | 2292 | 5.6% |
| 0 - none | 18 | < 0.1% |
| (Missing) | 1731 | 4.2% |
Length
Pie chart
| Value | Count | Frequency (%) |
| | 39457 | |
| a | 25748 | |
| 5 | 17411 | |
| amount | 14758 | 8.6% |
| lot | 12168 | 7.1% |
| 4 | 9515 | 5.6% |
| average | 9515 | 5.6% |
| bit | 8337 | 4.9% |
| 3 | 6045 | 3.5% |
| more | 6045 | 3.5% |
| Other values (7) | 22391 |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
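The calculation described above can be sketched in plain Python. This is a minimal illustration, not the routine the report generator uses: the helper names `average_ranks`, `pearson_r` and `spearman_rho` are hypothetical, and ties are assumed to receive the mean of the ranks they span.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        tied_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = tied_rank
        start = end + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Pearson correlation computed on the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but nonlinear in x
rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

On this toy data ρ is 1 up to floating-point rounding, while Pearson's r stays below 1 because the relationship is not linear.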
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the steepness of the slope (the angle to the x-axis) does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
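The invariance property is easy to check numerically. The sketch below uses a hand-written, hypothetical `pearson_r` helper and made-up toy data; it assumes positive scale factors, since a negative factor would flip the sign of r.

```python
def pearson_r(x, y):
    """Sample covariance divided by the product of sample standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    std_x = (sum((a - mean_x) ** 2 for a in x) / (n - 1)) ** 0.5
    std_y = (sum((b - mean_y) ** 2 for b in y) / (n - 1)) ** 0.5
    return cov / (std_x * std_y)


# Toy data, not taken from the report.
x = [1.0, 2.0, 4.0, 7.0, 11.0]
y = [2.0, 1.0, 5.0, 6.0, 12.0]

r_original = pearson_r(x, y)
# Shift and rescale each variable separately (positive scale factors).
r_rescaled = pearson_r([3 * a + 10 for a in x], [0.5 * b - 4 for b in y])

print(r_original, r_rescaled)
```

Both calls give the same value up to floating-point rounding, which is why variables measured on very different scales can still be compared through r.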
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
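The pair-counting definition above corresponds to the variant usually called τ-a, which has no correction for ties. The sketch below implements exactly that definition on toy data; the function name `kendall_tau_a` is illustrative, and library implementations often apply a tie-corrected variant instead, so their results can differ when ties are present.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total pairs, with no correction for ties."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1   # both variables order the pair the same way
        elif product < 0:
            discordant += 1   # the variables order the pair in opposite ways
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Toy data, not taken from the report: two adjacent swaps in an increasing sequence.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
tau = kendall_tau_a(x, y)
print(tau)  # 8 concordant and 2 discordant pairs out of 10
```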
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available.
First rows
| | age | b1 | b2 | c10 | c3 | c4 | c8 | dow | employment | i1 | i2 | i3 | i4 | i5 | marriage-status | month | n2 | n3 | n4 | n5 | n6 | school | successful_sell |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 34 | yes | no | no | False | new | NaN | fri | management | -1.8 | 93.075 | -47.1 | 1.405 | 5099.1 | divorced | apr | 2 | 530 | 999 | 0.001771 | 0 | 5 - a decent amount | no |
| 1 | 28 | yes | no | yes | False | new | NaN | thu | assistant | -1.8 | 92.893 | -46.2 | 1.327 | 5099.1 | divorced | may | 1 | 750 | 999 | -1.673152 | 0 | 5 - a lot | yes |
| 2 | 55 | no | no | no | unknown | new | NaN | tue | leisure | 1.4 | 93.918 | -42.7 | 4.962 | 5228.1 | married | jul | 3 | 600 | 999 | 0.927946 | 0 | 2 - a little bit | no |
| 3 | 47 | yes | no | no | False | new | NaN | mon | assistant | -0.1 | 93.200 | -42.0 | 4.191 | 5195.8 | married | nov | 1 | 860 | 999 | 0.203013 | 0 | 5 - a lot | no |
| 4 | 49 | no | no | no | unknown | new | NaN | tue | assistant | 1.4 | 93.918 | -42.7 | 4.961 | 5228.1 | married | jul | 6 | 620 | 999 | 0.990804 | 0 | 5 - a lot | no |
| 5 | 48 | no | no | no | False | new | NaN | tue | engineer | -0.1 | 93.200 | -42.0 | 4.153 | 5195.8 | married | nov | 4 | 620 | 999 | -0.766658 | 0 | 5 - a lot | no |
| 6 | 30 | no | no | no | unknown | new | NaN | fri | customer service | -1.8 | 92.893 | -46.2 | 1.313 | 5099.1 | single | may | 2 | 530 | 999 | -0.552628 | 0 | 4 - average amount | no |
| 7 | 32 | yes | yes | no | False | new | NaN | thu | management | -2.9 | 92.963 | -40.8 | 1.260 | 5076.2 | married | jun | 2 | 900 | 999 | -0.159617 | 0 | 4 - average amount | no |
| 8 | 53 | yes | yes | no | False | new | NaN | thu | management | -1.8 | 93.749 | -34.6 | 0.640 | 5008.7 | divorced | apr | 6 | 660 | 999 | 0.140678 | 0 | 5 - a lot | no |
| 9 | 30 | no | no | no | False | old | NaN | mon | laborer | 1.1 | 93.994 | -36.4 | 4.857 | 5191.0 | married | may | 4 | 650 | 999 | 0.570528 | 0 | 4 - average amount | no |
Last rows
| | age | b1 | b2 | c10 | c3 | c4 | c8 | dow | employment | i1 | i2 | i3 | i4 | i5 | marriage-status | month | n2 | n3 | n4 | n5 | n6 | school | successful_sell |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 41178 | 80 | no | yes | no | False | new | no | fri | leisure | -3.0 | 92.713 | -33.0 | 0.718 | 5023.5 | divorced | dec | 5 | 910 | 999 | 0.545113 | 1 | 1 - almost none | no |
| 41179 | 33 | no | yes | yes | False | new | no | wed | laborer | -1.8 | 92.893 | -46.2 | 1.281 | 5099.1 | married | may | 2 | 840 | 999 | -0.417591 | 1 | 1 - almost none | yes |
| 41180 | 24 | yes | no | no | False | new | NaN | tue | engineer | -2.9 | 92.963 | -40.8 | 1.262 | 5076.2 | single | jun | 2 | 780 | 999 | 0.774808 | 0 | 4 - average amount | no |
| 41181 | 32 | yes | yes | no | False | old | NaN | tue | management | -0.1 | 93.200 | -42.0 | 4.700 | 5195.8 | married | nov | 1 | 530 | 999 | -0.630333 | 0 | 5 - a lot | no |
| 41182 | 47 | yes | no | no | False | new | NaN | mon | customer service | 1.4 | 93.918 | -42.7 | 4.962 | 5228.1 | married | jul | 10 | 670 | 999 | 1.527868 | 0 | 2 - a little bit | no |
| 41183 | 33 | yes | no | no | False | old | NaN | mon | assistant | 1.4 | 94.465 | -41.8 | 4.865 | 5228.1 | married | jun | 3 | 620 | 999 | -0.050022 | 0 | 4 - average amount | no |
| 41184 | 36 | yes | no | no | False | old | NaN | mon | engineer | 1.4 | 94.465 | -41.8 | 4.961 | 5228.1 | married | jun | 1 | 650 | 999 | -2.310504 | 0 | 5 - a lot | no |
| 41185 | 36 | no | no | no | False | new | NaN | mon | engineer | 1.4 | 93.918 | -42.7 | 4.962 | 5228.1 | divorced | jul | 3 | 620 | 999 | 2.144238 | 0 | 5 - a decent amount | no |
| 41186 | 50 | no | no | no | False | old | NaN | fri | hobbyist | 1.4 | 94.465 | -41.8 | 4.959 | 5228.1 | married | jun | 2 | 880 | 999 | 0.359144 | 0 | 1 - almost none | no |
| 41187 | 28 | yes | no | no | False | old | NaN | tue | laborer | 1.1 | 93.994 | -36.4 | 4.857 | 5191.0 | married | may | 2 | 560 | 999 | 2.313130 | 0 | 2 - a little bit | no |